An Audio-Visual Particle Filter for Speaker Tracking on the CLEAR'06 Evaluation Dataset
Authors
Abstract
We present an approach for tracking a lecturer during the course of his speech. We use features from multiple cameras and microphones and process them in a joint particle filter framework. The filter performs sampled projections of 3D location hypotheses and scores them using features from both audio and video. On the video side, the features are based on foreground segmentation, multi-view face detection, and upper-body detection. On the audio side, the time delays of arrival between pairs of microphones are estimated with a generalized cross-correlation function. In the CLEAR'06 evaluation, the system yielded a tracking accuracy (MOTA) of 71% for video-only, 55% for audio-only, and 90% for combined audio-visual tracking.
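The audio-side step mentioned in the abstract, estimating the time delay of arrival between a microphone pair from a generalized cross-correlation, can be illustrated with a minimal sketch. The abstract does not say which weighting the authors used; the PHAT weighting below is an assumption, and the function name `gcc_delay` and its parameters are hypothetical, chosen only for illustration.

```python
import numpy as np

def gcc_delay(sig, ref, fs, max_tau=None):
    """Estimate the delay (in seconds) of `sig` relative to `ref`.

    Uses a generalized cross-correlation with PHAT weighting
    (an assumption; the abstract does not name the weighting).
    """
    n = len(sig) + len(ref)              # zero-pad to avoid circular wrap-around
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)
    cross = SIG * np.conj(REF)           # cross-power spectrum
    cross /= np.abs(cross) + 1e-12       # PHAT: keep phase, discard magnitude
    cc = np.fft.irfft(cross, n=n)        # correlation over circular lags

    max_shift = n // 2
    if max_tau is not None:              # optionally limit to physically possible lags
        max_shift = max(1, min(int(fs * max_tau), max_shift))

    # Reorder so that index `max_shift` corresponds to zero lag.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(cc)) - max_shift
    return shift / fs

# Usage: a synthetic signal delayed by 12 samples at a 16 kHz sampling rate.
rng = np.random.default_rng(0)
ref = rng.standard_normal(4096)
delay_samples = 12
sig = np.concatenate((np.zeros(delay_samples), ref[:-delay_samples]))
tau = gcc_delay(sig, ref, fs=16000)
```

In a tracker of the kind described, a delay estimated this way for each microphone pair could be compared with the delay implied by each 3D location hypothesis to score that hypothesis; how the authors actually combine the scores is not stated in the abstract.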
Similar resources
Speaker Tracking Using an Audio-visual Particle Filter
We present an approach for tracking a lecturer during the course of his speech. We use features from multiple cameras and microphones, and process them in a joint particle filter framework. The filter performs sampled projections of 3D location hypotheses and scores them using features from both audio and video. On the video side, the features are based on foreground segmentation, multi-view fa...
Particle Flow SMC-PHD Filter for Audio-Visual Multi-speaker Tracking
Sequential Monte Carlo probability hypothesis density (SMC-PHD) filtering has been recently exploited for audio-visual (AV) based tracking of multiple speakers, where audio data are used to inform the particle distribution and propagation in the visual SMC-PHD filter. However, the performance of the AV-SMC-PHD filter can be affected by the mismatch between the proposal and the posterior distribu...
A Speaker Tracking Algorithm Based on Audio and Visual Information Fusion Using Particle Filter
Object tracking by sensor fusion has become an active research area in recent years, but how to fuse various information in an efficient and robust way is still an open problem. This paper presents a new algorithm for tracking a speaker based on audio and visual information fusion using a particle filter. A closed-loop architecture with reliability of each individual tracker is adopted, and a new m...
Audio-visual speaker tracking with importance particle filters
We present a probabilistic method for audio-visual (AV) speaker tracking, using an uncalibrated wide-angle camera and a microphone array. The algorithm fuses 2-D object shape and audio information via importance particle filters (I-PFs), allowing for the asymmetrical integration of AV information in a way that efficiently exploits the complementary features of each modality. Audio localization ...
A Mixed-State I-Particle Filter for Multi-Camera Speaker Tracking
Tracking speakers in multi-party conversations represents an important step towards automatic analysis of meetings. In this paper, we present a probabilistic method for audio-visual (AV) speaker tracking in a multi-sensor meeting room. The algorithm fuses information coming from three uncalibrated cameras and a microphone array via a mixed-state importance particle filter, allowing for the inte...